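We introduce NeuralVDB, which improves on VDB, an existing industry standard for efficient storage of sparse volumetric data, by leveraging recent advances in machine learning. Our novel hybrid data structure can reduce the memory footprint of VDB volumes by orders of magnitude while maintaining its flexibility and incurring only a small (user-controlled) compression error. Specifically, NeuralVDB replaces the lower nodes of a shallow and wide VDB tree structure with multiple hierarchical neural networks that separately encode topology and value information by means of neural classifiers and regressors, respectively. This approach has proven to maximize the compression ratio while maintaining the spatial adaptivity offered by the higher-level VDB data structure. For sparse signed distance fields and density volumes, we have observed compression ratios on the order of $10\times$ to more than $100\times$ from already compressed VDB inputs, with little to no visual artifacts. We also demonstrate how its application to animated sparse volumes can accelerate training and generate temporally coherent neural networks.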
translated by 谷歌翻译
In modern on-driving computing environments, many sensors are used for context-aware applications. This paper uses two deep learning models built on convolutional neural networks (CNNs), U-Net and EfficientNet, to detect hand gestures and remove noise in Range Doppler Map images measured with a millimeter-wave (mmWave) radar. Accurate pre-processing algorithms are essential for improving classification performance. We therefore denoise the images with a novel pre-processing approach before they enter the first deep learning model stage, which increases classification accuracy. In short, this paper proposes a high-performance nonlinear pre-processing method based on deep neural networks.
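Recommender systems help users by appropriately narrowing down the vast amount of information now available online. Session-based recommendation, a subfield of recommender systems, tries to recommend items by interpreting sessions, each of which consists of a sequence of items. Recently, research on incorporating user information into these sessions has been progressing. However, it is difficult to generate a high-quality user representation that incorporates the representations of the sessions the user generated. In this paper, we consider the various relationships in a graph built from sessions by means of a heterogeneous attention network. A constraint additionally forces the user representation to take into account the user preferences expressed in the session, with the aim of improving performance through additional optimization during training. The proposed model outperforms other methods on various real-world datasets.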
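Heterogeneous graph neural networks can represent the information in heterogeneous graphs with excellent ability. Recently, self-supervised learning approaches have been studied that learn distinctive graph representations through contrastive learning. Such approaches show great potential when labels are unavailable. However, contrastive learning relies heavily on positive and negative pairs, and generating high-quality pairs from a heterogeneous graph is difficult. In this paper, in line with recent innovations in self-supervised learning, we introduce a method that produces good representations without generating a large number of pairs. In addition, noting that a heterogeneous graph can be viewed from two perspectives in this process, the method captures and expresses high-level representations in the graph. The proposed model achieves state-of-the-art performance compared with other methods on various real-world datasets.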
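In general, graph neural networks (GNNs) use message passing to aggregate and summarize information about a node's neighbors in order to represent that node. Nevertheless, previous studies have shown that, because of this message passing scheme, the performance of a GNN becomes vulnerable when anomalous nodes are present in the neighborhood. In this paper, inspired by neural architecture search methods, we introduce an algorithm that identifies anomalous nodes and automatically excludes them from information aggregation. Experiments on various real-world datasets show that our proposed Neural Architecture Search-based Anomaly Resistance Graph Neural Network (NASAR-GNN) is effective in practice.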
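To make the idea of excluding anomalous nodes from aggregation concrete, the following is a minimal sketch of mean aggregation with a per-node keep mask. It is not the NASAR-GNN algorithm: the function name `masked_mean_aggregate` and the hand-written `keep` vector are assumptions for illustration, whereas the abstract describes learning which nodes to exclude.

```python
import numpy as np


def masked_mean_aggregate(features, adjacency, keep):
    """Mean-aggregate neighbour features, skipping neighbours with keep == 0.

    features:  (n_nodes, n_features) array of node features.
    adjacency: (n_nodes, n_nodes) 0/1 array, adjacency[i, j] = 1 if j is a neighbour of i.
    keep:      (n_nodes,) 0/1 array; 0 marks a node treated as anomalous (hypothetical input).
    """
    # Zero out the columns of excluded nodes so they send no messages.
    weights = adjacency * keep[np.newaxis, :]
    degree = weights.sum(axis=1, keepdims=True)
    # Avoid division by zero for nodes whose neighbours were all excluded.
    safe_degree = np.where(degree > 0, degree, 1.0)
    return (weights @ features) / safe_degree


features = np.array([[1.0], [3.0], [100.0]])  # node 2 carries an outlier value
adjacency = np.array([[0, 1, 1],
                      [1, 0, 1],
                      [1, 1, 0]], dtype=float)

plain = masked_mean_aggregate(features, adjacency, np.array([1.0, 1.0, 1.0]))
robust = masked_mean_aggregate(features, adjacency, np.array([1.0, 1.0, 0.0]))
```

With all nodes kept, the outlier dominates the aggregate for nodes 0 and 1; with node 2 masked out, each of them only sees the other.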
3D-aware image synthesis focuses on preserving spatial consistency in addition to generating high-resolution images with fine details. Recently, Neural Radiance Field (NeRF) has been introduced for synthesizing novel views with low computational cost and superior performance. While several works investigate a generative NeRF and show remarkable achievement, they cannot handle conditional and continuous feature manipulation in the generation procedure. In this work, we introduce a novel model, called Class-Continuous Conditional Generative NeRF ($\text{C}^{3}$G-NeRF), which can synthesize conditionally manipulated photorealistic 3D-consistent images by projecting conditional features to the generator and the discriminator. The proposed $\text{C}^{3}$G-NeRF is evaluated with three image datasets, AFHQ, CelebA, and Cars. As a result, our model shows strong 3D-consistency with fine details and smooth interpolation in conditional feature manipulation. For instance, $\text{C}^{3}$G-NeRF exhibits a Fr\'echet Inception Distance (FID) of 7.64 in 3D-aware face image synthesis with a $128^{2}$ resolution. Additionally, we provide FIDs of generated 3D-aware images for each class of the datasets, since $\text{C}^{3}$G-NeRF makes it possible to synthesize class-conditional images.
Cellular automata (CA) captivate researchers due to the emergent, complex, individualized behavior that simple global rules of interaction enact. Recent advances in the field have combined CA with convolutional neural networks to achieve self-regenerating images. This new branch of CA is called neural cellular automata [1]. The goal of this project is to use the idea of neural cellular automata to grow prediction machines. We place many different convolutional neural networks in a grid. Each conv net cell outputs a prediction of what the next state will be and minimizes predictive error. Cells receive their neighbors' colors and fitnesses as input. Each cell's fitness score describes how accurate its predictions are. Cells can also move to explore their environment, and some stochasticity is applied to movement.
There is a dramatic shortage of skilled labor for modern vineyards. The Vinum project is developing a mobile robotic solution to autonomously navigate through vineyards for winter grapevine pruning. This necessitates an autonomous navigation stack for the robot pruning a vineyard. The Vinum project is using the quadruped robot HyQReal. This paper introduces an architecture for a quadruped robot to autonomously move through a vineyard by identifying and approaching grapevines for pruning. The higher level control is a state machine switching between searching for destination positions, autonomously navigating towards those locations, and stopping for the robot to complete a task. The destination points are determined by identifying grapevine trunks using instance segmentation from a Mask Region-Based Convolutional Neural Network (Mask-RCNN). These detections are sent through a filter to avoid redundancy and remove noisy detections. The combination of these features is the basis for the proposed architecture.
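The control flow described above (search, navigate, stop for a task) and the detection filter can be sketched as follows. This is only an illustrative sketch, not the Vinum implementation: the state names, the `next_state` and `filter_detections` helpers, the `(x, y, score)` detection format, and both thresholds are hypothetical.

```python
import math
from enum import Enum, auto


class State(Enum):
    SEARCH = auto()    # looking for the next grapevine trunk
    NAVIGATE = auto()  # moving towards the chosen destination
    STOP = auto()      # standing still while the task is carried out


def next_state(state, target_found, at_target, task_done):
    """Return the next high-level state given three boolean observations."""
    if state is State.SEARCH:
        return State.NAVIGATE if target_found else State.SEARCH
    if state is State.NAVIGATE:
        return State.STOP if at_target else State.NAVIGATE
    # STOP: wait for the task to finish, then search for the next destination.
    return State.SEARCH if task_done else State.STOP


def filter_detections(detections, min_score=0.5, min_separation=0.3):
    """Drop low-confidence detections and near-duplicates of stronger ones.

    detections: iterable of (x, y, score) tuples; both thresholds are assumed values.
    """
    kept = []
    for x, y, score in sorted(detections, key=lambda d: -d[2]):
        if score < min_score:
            continue  # treat as a noisy detection
        if all(math.hypot(x - kx, y - ky) >= min_separation for kx, ky, _ in kept):
            kept.append((x, y, score))
    return kept


trunks = filter_detections(
    [(1.0, 0.0, 0.9), (1.1, 0.0, 0.8), (3.0, 0.0, 0.95), (5.0, 0.0, 0.2)]
)
```

Here the detection at `x = 1.1` is discarded as a duplicate of the stronger one at `x = 1.0`, and the detection at `x = 5.0` is discarded for its low score.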
Feature selection helps reduce data acquisition costs in ML, but the standard approach is to train models with static feature subsets. Here, we consider the dynamic feature selection (DFS) problem where a model sequentially queries features based on the presently available information. DFS is often addressed with reinforcement learning (RL), but we explore a simpler approach of greedily selecting features based on their conditional mutual information. This method is theoretically appealing but requires oracle access to the data distribution, so we develop a learning approach based on amortized optimization. The proposed method is shown to recover the greedy policy when trained to optimality and outperforms numerous existing feature selection methods in our experiments, thus validating it as a simple but powerful approach for this problem.
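As an illustration of the greedy policy that the abstract refers to, the sketch below selects the next feature by conditional mutual information when the full joint distribution is known, which is the oracle setting rather than the learned, amortized method the paper proposes. The function names and the toy distribution are assumptions made for this example.

```python
import numpy as np


def conditional_mutual_information(joint, observed, candidate):
    """I(y; x_candidate | x_S = observed values) for a discrete joint table.

    joint:    array of shape (k_1, ..., k_d, n_classes); the last axis is the label y.
    observed: dict {feature_index: observed_value} for the features queried so far.
    """
    d = joint.ndim - 1
    # Condition on the observed features by slicing the table, then renormalise.
    index = tuple(observed.get(i, slice(None)) for i in range(d)) + (slice(None),)
    cond = joint[index]
    cond = cond / cond.sum()
    # The remaining axes are the unobserved features (in order), followed by y.
    remaining = [i for i in range(d) if i not in observed]
    axis = remaining.index(candidate)
    other_axes = tuple(a for a in range(len(remaining)) if a != axis)
    p_xy = cond.sum(axis=other_axes)       # shape (k_candidate, n_classes)
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0                        # skip zero cells to avoid log(0)
    return float((p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])).sum())


def greedy_next_feature(joint, observed):
    """Return the unobserved feature with the largest conditional mutual information."""
    d = joint.ndim - 1
    candidates = [i for i in range(d) if i not in observed]
    scores = {i: conditional_mutual_information(joint, observed, i) for i in candidates}
    return max(scores, key=scores.get), scores


# Toy distribution: y equals x0, x1 is a noisy copy of y, x2 is independent noise.
joint = np.zeros((2, 2, 2, 2))
for x0 in range(2):
    for x1 in range(2):
        for x2 in range(2):
            y = x0
            joint[x0, x1, x2, y] = 0.5 * (0.9 if x1 == y else 0.1) * 0.5

first, first_scores = greedy_next_feature(joint, {})
second, second_scores = greedy_next_feature(joint, {0: 1})
```

In this toy case the first query is `x0`, which determines the label, so every remaining feature has zero conditional mutual information afterwards.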
In this paper, we learn a diffusion model to generate 3D data on a scene-scale. Specifically, our model crafts a 3D scene consisting of multiple objects, while recent diffusion research has focused on a single object. To realize our goal, we represent a scene with discrete class labels, i.e., categorical distribution, to assign multiple objects into semantic categories. Thus, we extend discrete diffusion models to learn scene-scale categorical distributions. In addition, we validate that a latent diffusion model can reduce computation costs for training and deploying. To the best of our knowledge, our work is the first to apply discrete and latent diffusion for 3D categorical data on a scene-scale. We further propose to perform semantic scene completion (SSC) by learning a conditional distribution using our diffusion model, where the condition is a partial observation in a sparse point cloud. In experiments, we empirically show that our diffusion models not only generate reasonable scenes, but also perform the scene completion task better than a discriminative model. Our code and models are available at https://github.com/zoomin-lee/scene-scale-diffusion
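For readers unfamiliar with diffusion over categorical labels, here is a minimal sketch of one forward noising step on a voxel grid of class labels. It assumes a uniform transition kernel, a common choice for discrete diffusion that the abstract does not confirm; the function names, the noise level `beta`, and the grid size are illustrative assumptions, not taken from the linked repository.

```python
import numpy as np


def uniform_transition_matrix(num_classes, beta):
    """Keep the current label with weight 1 - beta, else resample uniformly."""
    identity = np.eye(num_classes)
    uniform = np.ones((num_classes, num_classes)) / num_classes
    return (1.0 - beta) * identity + beta * uniform


def forward_step(labels, beta, num_classes, rng):
    """Sample x_t ~ q(x_t | x_{t-1}) independently for every voxel."""
    q = uniform_transition_matrix(num_classes, beta)
    probs = q[labels]                      # shape: labels.shape + (num_classes,)
    cumulative = probs.cumsum(axis=-1)
    u = rng.random(labels.shape + (1,))
    # Inverse-CDF sampling: count how many cumulative bins lie below u.
    sampled = (u > cumulative).sum(axis=-1)
    return sampled.clip(max=num_classes - 1)  # guard against rounding in the last bin


rng = np.random.default_rng(0)
scene = rng.integers(0, 4, size=(4, 4, 4))    # toy semantic voxel grid with 4 classes
unchanged = forward_step(scene, beta=0.0, num_classes=4, rng=rng)
noised = forward_step(scene, beta=1.0, num_classes=4, rng=rng)
```

With `beta = 0.0` the grid is returned unchanged, and with `beta = 1.0` every voxel is resampled uniformly over the classes.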